Package hdlm: Regression Tables for High Dimensional Linear Model Estimation
Author
Abstract
We present the R package hdlm, created to facilitate the study of high dimensional datasets. Our emphasis is on the production of regression tables and a class ‘hdlm’ for which new extensions can be easily written. We model our work on the functionality given for linear and generalized linear models from the functions lm and glm in the recommended package stats. Reasonable default options have been selected so that the package may be used immediately by anyone familiar with the low dimensional variants; however, a generic procedure for using alternative point estimators is also provided. Two techniques are given for constructing high dimensional regression tables. The first uses the two-stage approach of Wasserman and Roeder (2009), with the generalization proposed by Meinshausen, Meier, and Bühlmann (2009) to increase robustness, in order to calculate high dimensional p values. We introduce and implement a novel method for generalizing these p value methods to confidence intervals. The second technique constructs regression tables using a hierarchical Bayesian approach solved via Gibbs sampling MCMC. In this article, we focus on design choices made in the package, relevant computational issues, and approaches to changing the default options.
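As a rough illustration of the robustness step mentioned in the abstract, the sketch below shows one common way to combine p values obtained from several random sample splits: the fixed-quantile rule with γ = 0.5, that is, twice the median of the per-split p values, capped at 1. It is written in Python only to keep it self-contained. The function names and the example numbers are hypothetical, and nothing here is taken from hdlm's source code or claims to match its defaults.

```python
from statistics import median


def aggregate_split_p_values(p_by_split):
    """Combine the adjusted p values of one coefficient across splits.

    Assumed rule: fixed quantile with gamma = 0.5, i.e. twice the
    median of the per-split p values, capped at 1. For an even number
    of splits, statistics.median averages the two middle values.
    """
    if not p_by_split:
        raise ValueError("need at least one split")
    return min(1.0, 2.0 * median(p_by_split))


def aggregate_table(p_matrix):
    """p_matrix[b][j] is the adjusted p value of coefficient j in split b."""
    n_coef = len(p_matrix[0])
    return [
        aggregate_split_p_values([row[j] for row in p_matrix])
        for j in range(n_coef)
    ]


# Hypothetical adjusted p values from 5 random splits for 3 coefficients.
# A coefficient that was not selected in a split is assigned 1 for that split.
p_matrix = [
    [0.001, 1.0, 0.20],
    [0.004, 1.0, 1.0],
    [0.002, 0.03, 1.0],
    [0.010, 1.0, 0.45],
    [0.003, 1.0, 1.0],
]

print(aggregate_table(p_matrix))
```

A coefficient keeps a small aggregated p value only if it is selected and significant in most splits, so a single favorable split (as for the second coefficient here) does not drive the result.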
Similar resources
Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data
Background and purpose: By evolving science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...
Robust Estimation in Linear Regression with Multicollinearity and Sparse Models
One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...
Package ‘AICcmodavg’ September 12, 2013
Description This package includes functions to create model selection tables based on Akaike’s information criterion (AIC) and the second-order AIC (AICc), as well as their quasi-likelihood counterparts (QAIC, QAICc). Tables are printed with delta AIC and Akaike weights. The package also features functions to conduct classic model averaging (multimodel inference) for a given parameter of intere...
Methods for regression analysis in high-dimensional data
By evolving science, knowledge and technology, new and precise methods for measuring, collecting and recording information have been innovated, which have resulted in the appearance and development of high-dimensional data. The high-dimensional data set, i.e., a data set in which the number of explanatory variables is much larger than the number of observations, cannot be easily analyzed by ...
The flare package for high dimensional linear regression and precision matrix estimation in R
This paper describes an R package named flare, which implements a family of new high dimensional regression methods (LAD Lasso, SQRT Lasso, ℓq Lasso, and Dantzig selector) and their extensions to sparse precision matrix estimation (TIGER and CLIME). These methods exploit different nonsmooth loss functions to gain modeling flexibility, estimation robustness, and tuning insensitiveness. The develo...